index for using different partitioning rules for the data shown in Figure 3.37(a). It can be seen that they indeed reach their optimal values (the minimum Gini index and the maximum information gain) when the partitioning rule is y = 2.
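The calculation behind such a comparison can be sketched in a few lines of Python. The snippet below uses hypothetical two-class data standing in for Figure 3.37(a) (the data set, the function names and the candidate thresholds are illustrative assumptions, not the book's data); it scores each candidate rule y < T by the weighted Gini index and by the information gain.

import numpy as np

def gini(labels):
    """Gini index of a label set: 1 - sum_k p_k^2 (0 for a pure set)."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    """Entropy of a label set: -sum_k p_k log2 p_k."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def score_rule(values, labels, T):
    """Weighted Gini index and information gain of the rule `values < T`."""
    left, right = labels[values < T], labels[values >= T]
    w_left, w_right = len(left) / len(labels), len(right) / len(labels)
    weighted_gini = w_left * gini(left) + w_right * gini(right)
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    return weighted_gini, info_gain

# Hypothetical data: class 0 lies below y = 2 and class 1 above it.
y_coord = np.array([0.5, 1.0, 1.5, 2.5, 3.0, 3.5])
label = np.array([0, 0, 0, 1, 1, 1])
for T in (1.0, 2.0, 3.0):
    g, ig = score_rule(y_coord, label, T)
    print(f"rule y < {T}: weighted Gini = {g:.3f}, information gain = {ig:.3f}")

For this toy data the rule y < 2 gives a weighted Gini index of zero and an information gain of one bit, consistent with the optimum described above.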

The classification and regression tree algorithm

Creating a classification tree model is based on a series of optimisation operations, each of which is implemented by measuring the value of a data impurity index, as introduced above, for each candidate partitioning rule. The classification and regression tree (CART) algorithm can be used for both classification analysis and regression analysis. It works in three steps [Breiman, et al., 1984].
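For orientation, the whole procedure is also available in standard libraries. Below is a minimal sketch using scikit-learn's DecisionTreeClassifier, an independent implementation of CART rather than the derivation given here; the criterion argument switches between minimising the Gini index and maximising the information gain, and the iris data set is used purely as a placeholder.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# criterion="gini" selects each threshold T by minimising the Gini index;
# criterion="entropy" would maximise the information gain instead.
clf = DecisionTreeClassifier(criterion="gini", random_state=0)
clf.fit(X, y)

print(clf.get_depth(), clf.get_n_leaves())  # size of the grown tree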

In the first step, a tree is grown for a data set based on the recursive partitioning of a data space using a series of partitioning rules. A decision-making node is created in a tree for every partitioning rule. Every node is composed of two parts. One is a selected variable (such as x) and the other is a selected threshold (such as T). The selection of T is based on the minimisation of the Gini index or the maximisation of the entropy index (information gain) among many candidate partitioning rules. A logic operator is used to combine them to generate a logic expression for representing a partitioning rule, such as x < T. A data space, which is either the original space or a previously partitioned subspace, is called a working space for testing a newly generated partitioning rule. Suppose a partitioning rule is defined as a logic expression, i.e., x < T. A working space is divided into two subspaces by testing this logic expression. One of the subspaces is composed of the data points which satisfy this logic expression, i.e., x < T. The other subspace is composed of the rest of the data points; these data points violate this logic expression, i.e., x ≮ T or, equivalently, x ≥ T.
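As a concrete illustration of this splitting test, the short Python sketch below divides a working space into the two subspaces defined by x < T and x ≥ T (the function name split_working_space, the feature index and the threshold T are illustrative assumptions).

import numpy as np

def split_working_space(X, y, feature, T):
    """Divide a working space by testing the logic expression X[:, feature] < T."""
    mask = X[:, feature] < T                            # data points satisfying x < T
    return (X[mask], y[mask]), (X[~mask], y[~mask])     # the rest satisfy x >= T

# Example: split a one-dimensional working space at T = 2.
X = np.array([[0.5], [1.5], [2.5], [3.5]])
y = np.array([0, 0, 1, 1])
(left_X, left_y), (right_X, right_y) = split_working_space(X, y, feature=0, T=2.0)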

Data impurity is evaluated for each newly partitioned subspace and hence each node. If the Gini index is zero, no further partition is required from this node. Each node below this node is thus labelled according to the class property of the data points in the corresponding subspace. Such a node with a labelled class is called a leaf in a decision-making tree.
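Putting the first step together, the following is a minimal recursive sketch under the same assumptions as the earlier snippets (the helper names grow and best_rule are hypothetical, and only an exhaustive search over observed feature values is attempted): a working space is split by the rule that minimises the weighted Gini index, and recursion stops with a labelled leaf once a subspace is pure.

import numpy as np

def gini(y):
    """Gini index of the labels in a subspace (0 means the subspace is pure)."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_rule(X, y):
    """Search candidate rules x_j < T and return the one minimising the weighted Gini index."""
    best = None
    for j in range(X.shape[1]):
        for T in np.unique(X[:, j]):
            left = X[:, j] < T
            if not left.any() or left.all():    # rule must actually split the working space
                continue
            score = left.mean() * gini(y[left]) + (~left).mean() * gini(y[~left])
            if best is None or score < best[0]:
                best = (score, j, T)
    return best

def grow(X, y):
    """Step one of tree growing: recursively partition until each subspace is pure."""
    rule = None if gini(y) == 0.0 else best_rule(X, y)
    if rule is None:                            # pure (or unsplittable) subspace: labelled leaf
        labels, counts = np.unique(y, return_counts=True)
        return {"leaf": True, "label": labels[np.argmax(counts)]}
    _, j, T = rule
    left = X[:, j] < T                          # test the logic expression x_j < T
    return {"leaf": False, "feature": j, "threshold": T,
            "satisfied": grow(X[left], y[left]),   # subspace where x_j < T holds
            "violated": grow(X[~left], y[~left])}  # the rest, i.e. x_j >= T

Calling grow on a small labelled data set returns a nested dictionary whose internal nodes hold the selected variable and threshold, and whose leaves carry the class labels, mirroring the node structure described above.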